Prasad Rajesh Posture

Batch: June 2022
Data Analytics with Python

Task: Perform data analysis on India's state-wise COVID-19 data using various visualization tools.

Importing the libraries

Loading the data

Data source: https://www.kaggle.com/datasets/prasadposture121/covid19-india-statewise-data
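A minimal sketch of the loading step. The real CSV has to be downloaded from the Kaggle page above, so an in-memory stand-in is used here to keep the example runnable. The column names and every number in it are made up for illustration and are not taken from the dataset.

```python
import io

import pandas as pd

# Stand-in for the downloaded CSV. Column names and values are hypothetical.
sample_csv = io.StringIO(
    "State/UTs,Total Cases,Active,Discharged,Deaths\n"
    "State A,1000,50,930,20\n"
    "State B,400,10,385,5\n"
    "State C,250,25,220,5\n"
)

# With the real file, pass its local path to pd.read_csv instead of sample_csv.
df = pd.read_csv(sample_csv)
print(df.head())
```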

Renaming the data columns
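Renaming can be sketched with `DataFrame.rename`. Both the original and the new column names below are assumptions chosen for illustration; the actual names depend on the downloaded file.

```python
import pandas as pd

# Hypothetical frame with the kind of headers a raw CSV might carry.
df = pd.DataFrame({
    "State/UTs": ["State A", "State B"],
    "Total Cases": [1000, 400],
    "Active": [50, 10],
})

# Map old names to shorter, code-friendly ones; rename returns a new frame.
renamed = df.rename(columns={
    "State/UTs": "state",
    "Total Cases": "total_cases",
    "Active": "active",
})
print(renamed.columns.tolist())
```

Names without spaces or slashes are easier to reference in later plotting calls.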

Exploratory Data Analysis

Various data types of the dataset

Columns of the data

Checking for null values

Checking for the duplicate values
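The four checks above (data types, columns, null values, duplicates) could look roughly like this. The frame is a made-up stand-in with one missing value and one repeated row added on purpose so that the checks have something to report.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 2 repeats row 1, and one death count is missing.
df = pd.DataFrame({
    "state": ["State A", "State B", "State B", "State C"],
    "total_cases": [1000, 400, 400, 250],
    "deaths": [20.0, 5.0, 5.0, np.nan],
})

print(df.dtypes)            # data type of each column
print(df.columns.tolist())  # column names

null_counts = df.isnull().sum()          # missing values per column
duplicate_rows = df.duplicated().sum()   # rows identical to an earlier row
print(null_counts)
print(duplicate_rows)
```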

Shape of the dataset (rows, columns)

Size of the dataset

Basic Information of the dataset

Statistical information for each column
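A short sketch of the shape, size, info and summary-statistics steps, again on a small hypothetical frame rather than the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for the state-wise table.
df = pd.DataFrame({
    "state": ["State A", "State B", "State C"],
    "total_cases": [1000, 400, 250],
    "deaths": [20, 5, 5],
})

n_rows, n_cols = df.shape   # (rows, columns)
n_cells = df.size           # rows * columns
print(n_rows, n_cols, n_cells)

df.info()                   # dtypes, non-null counts and memory usage

summary = df.describe()     # count, mean, std, min, quartiles, max (numeric columns only)
print(summary)
```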

Checking for correlation between the data columns

Numeric Correlations
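The numeric correlations can be sketched as a Pearson correlation matrix over the numeric columns. Column names and values here are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the state-wise table.
df = pd.DataFrame({
    "state": ["State A", "State B", "State C", "State D"],
    "total_cases": [1000, 400, 250, 800],
    "discharged": [930, 385, 220, 760],
    "deaths": [20, 5, 5, 15],
})

# Keep only numeric columns, then compute pairwise Pearson correlations.
corr = df.select_dtypes(include="number").corr()
print(corr.round(3))
```

Selecting the numeric columns first avoids errors from the text column holding the state names.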

Correlation using Heatmap
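One way to draw such a heatmap is with plain matplotlib, shown below as a sketch. This is not necessarily the plotting library used in the original notebook, and the data is hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical numeric columns of the state-wise table.
df = pd.DataFrame({
    "total_cases": [1000, 400, 250, 800],
    "discharged": [930, 385, 220, 760],
    "deaths": [20, 5, 5, 15],
})
corr = df.corr()

fig, ax = plt.subplots(figsize=(4, 4))
# Fix the colour scale to [-1, 1] so colours are comparable across datasets.
image = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=45, ha="right")
ax.set_yticks(range(len(corr.index)))
ax.set_yticklabels(corr.index)

# Write each coefficient inside its cell.
for i in range(corr.shape[0]):
    for j in range(corr.shape[1]):
        ax.text(j, i, f"{corr.iloc[i, j]:.2f}", ha="center", va="center")

fig.colorbar(image, ax=ax)
fig.tight_layout()
```

In a notebook the figure renders when the cell finishes; in a script, add `plt.show()` at the end.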

Correlation using Scatter Matrix
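A scatter matrix can be sketched with `pandas.plotting.scatter_matrix`, which plots every numeric column against every other one. The data below is hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import scatter_matrix

# Hypothetical numeric columns of the state-wise table.
df = pd.DataFrame({
    "total_cases": [1000, 400, 250, 800],
    "discharged": [930, 385, 220, 760],
    "deaths": [20, 5, 5, 15],
})

# Off-diagonal panels are pairwise scatter plots; the diagonal shows histograms.
axes = scatter_matrix(df, figsize=(6, 6), diagonal="hist")
plt.tight_layout()
```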

Conclusion: The plots and the correlation values show a strong positive correlation between the total number of cases and the number of discharged people, which indicates that most infected people recover from the disease. The negative correlation between death ratio and discharge ratio supports this. The number of deaths is also significantly correlated with the total number of cases, but this is a consequence of the exponential spread of the virus: as cases grow, deaths grow with them.

Data Visualization

Comparing each attribute of the dataset, such as the total number of cases, the number of active cases and the number of deaths, across the states / union territories.
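The comparison can be sketched as one bar plot per attribute, with the states sorted by that attribute. The column names and values below are hypothetical, and matplotlib is used as an assumption about the plotting library.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the state-wise table.
df = pd.DataFrame({
    "state": ["State B", "State A", "State C"],
    "total_cases": [400, 1000, 250],
    "active": [10, 50, 25],
})

attributes = ["total_cases", "active"]
fig, axes = plt.subplots(len(attributes), 1, figsize=(6, 6))

for ax, column in zip(axes, attributes):
    # Sort so the state with the largest value comes first.
    ordered = df.sort_values(column, ascending=False)
    ax.bar(ordered["state"], ordered[column])
    ax.set_title(column)
    ax.tick_params(axis="x", rotation=90)

fig.tight_layout()
```

Sorting each panel separately makes the highest and lowest states easy to read off, at the cost of a different state order in every panel.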

Conclusion: Maharashtra has the highest number of total cases and Andaman and Nicobar has the lowest. Kerala has the highest number of active cases. Maharashtra has a large number of discharged people but also a large number of deaths, which can be attributed to the large and dense population of its cities. Haryana has the highest active case ratio and Punjab has the highest death ratio, while the discharge ratio is almost the same for every state / union territory. The last plot shows the population of each state / union territory.

One can make these plots more interesting by comparing the attributes of the states / union territories on the map of India.

The above plots show the same comparative relationship between the states / UTs and the attributes as the bar plots, but the map makes it easier to understand how the virus might have spread across the country. The limitation of these plots is that the union territories are not easily visible because of their small size.

Getting the total number of cases, active cases, deaths and discharges across the nation
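The nationwide totals are column sums over the state-wise rows. A minimal sketch on hypothetical data, assuming one row per state / union territory:

```python
import pandas as pd

# Hypothetical stand-in for the state-wise table.
df = pd.DataFrame({
    "state": ["State A", "State B", "State C"],
    "total_cases": [1000, 400, 250],
    "active": [50, 10, 25],
    "discharged": [930, 385, 220],
    "deaths": [20, 5, 5],
})

# Sum each count column over all states to get the national figures.
totals = df[["total_cases", "active", "discharged", "deaths"]].sum()
print(totals)
```

Only the count columns are summed; ratio columns are not additive across states and would need to be recomputed from the totals.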

The End